Learning augmented Bayesian classifiers: A comparison of distribution-based and classification-based approaches

نویسندگان

  • Eamonn J. Keogh
  • Michael J. Pazzani
چکیده

The naïve Bayes classifier is built on the assumption of conditional independence between the attributes given the class. The algorithm has been shown to be surprisingly robust to obvious violations of this condition, but it is natural to ask if it is possible to further improve the accuracy by relaxing this assumption. We examine an approach where naïve Bayes is augmented by the addition of correlation arcs between attributes. We explore two methods for finding the set of augmenting arcs, a greedy hillclimbing search, and a novel, more computationally efficient algorithm that we call SuperParent. We compare these methods to TAN; a state-of the-art distribution-based approach to finding the augmenting arcs.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

دسته‌بندی پرسش‌ها با استفاده از ترکیب دسته‌بندها

Question answering systems are produced and developed to provide exact answers to the question posted in natural language. One of the most important parts of question answering systems is question classification. The purpose of question classification is predicting the kind of answer needed for the question in natural language. The  literature works can be categorized as rule-based and learning...

متن کامل

ارتقای کیفیت دسته‌بندی متون با استفاده از کمیته‌ دسته‌بند دو سطحی

Nowadays, the automated text classification has witnessed special importance due to the increasing availability of documents in digital form and ensuing need to organize them. Although this problem is in the Information Retrieval (IR) field, the dominant approach is based on machine learning techniques. Approaches based on classifier committees have shown a better performance than the others. I...

متن کامل

Comparison of Estimates Using Record Statistics from Lomax Model: Bayesian and Non Bayesian Approaches

This paper address the problem of Bayesian estimation of the parameters, reliability and hazard function in the context of record statistics values from the two-parameter Lomax distribution. The ML and the Bayes estimates based on records are derived for the two unknown parameters and the survival time parameters, reliability and hazard functions. The Bayes estimates are obtained based on conju...

متن کامل

Bayesian network classifiers which perform well with continuous attributes: Flexible classifiers

When modelling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous works have solved the problem by discretizing them with the consequent loss of information. Another common alternative assumes that the data are generated by a Gaussian distribution (parametric approach), such as conditional Gaussian networks, wit...

متن کامل

Feature-based Malicious URL and Attack Type Detection Using Multi-class Classification

Nowadays, malicious URLs are the common threat to the businesses, social networks, net-banking etc. Existing approaches have focused on binary detection i.e. either the URL is malicious or benign. Very few literature is found which focused on the detection of malicious URLs and their attack types. Hence, it becomes necessary to know the attack type and adopt an effective countermeasure. This pa...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999